 truth data


Variational Autoencoder for Calibration: A New Approach

arXiv.org Artificial Intelligence

In this paper we present a new implementation of a Variational Autoencoder (VAE) for the calibration of sensors. We propose that the VAE can be used to calibrate sensor data by training the latent space as a calibration output. We discuss this new approach and show a proof-of-concept using an existing multi-sensor gas dataset. We show the performance of the proposed calibration VAE and find that it is capable of performing as a calibration model while simultaneously performing as an autoencoder. Additionally, these models are shown to be capable of creating outputs, from both the calibration output and the reconstruction output, that are statistically similar to their respective truth data. We then discuss the methods of future testing and planned expansion of this work.
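
The idea of training the latent space as a calibration output can be illustrated with a combined loss. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the three-term loss, the weights `beta` and `gamma`, and all function names are hypothetical, and the abstract does not say how the terms are actually balanced.

```python
import math

def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def kl_standard_normal(mu, log_var):
    """KL( N(mu, exp(log_var)) || N(0, 1) ), summed over latent dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))

def calibration_vae_loss(x, x_recon, mu, log_var, reference, beta=1.0, gamma=1.0):
    """Hypothetical loss: reconstruction + latent-vs-reference + KL regularizer.

    x, x_recon : raw sensor input and the decoder's reconstruction
    mu, log_var: encoder outputs (latent mean and log-variance)
    reference  : truth (reference instrument) values the latent mean should match
    beta, gamma: assumed weights, not taken from the paper
    """
    reconstruction_term = mse(x, x_recon)   # autoencoder objective
    calibration_term = mse(mu, reference)   # latent space as calibration output
    kl_term = kl_standard_normal(mu, log_var)
    return reconstruction_term + gamma * calibration_term + beta * kl_term
```

In this sketch the KL term pulls the latent mean toward zero while the calibration term pulls it toward the reference value, so in practice the weights (or a normalization of the reference data) would need tuning.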


SPICE-HL3: Single-Photon, Inertial, and Stereo Camera dataset for Exploration of High-Latitude Lunar Landscapes

arXiv.org Artificial Intelligence

Exploring high-latitude lunar regions presents an extremely challenging visual environment for robots. The low sunlight elevation angle and minimal light scattering result in a visual field dominated by a high dynamic range featuring long, dynamic shadows. Reproducing these conditions on Earth requires sophisticated simulators and specialized facilities. We introduce a unique dataset recorded at the LunaLab from the SnT - University of Luxembourg, an indoor test facility designed to replicate the optical characteristics of multiple lunar latitudes. Our dataset includes images, inertial measurements, and wheel odometry data from robots navigating seven distinct trajectories under multiple illumination scenarios, simulating high-latitude lunar conditions from dawn to night time with and without the aid of headlights, resulting in 88 distinct sequences containing a total of 1.3M images. Data was captured using a stereo RGB-inertial sensor, a monocular monochrome camera, and for the first time, a novel single-photon avalanche diode (SPAD) camera. We recorded both static and dynamic image sequences, with robots navigating at slow (5 cm/s) and fast (50 cm/s) speeds. All data is calibrated, synchronized, and timestamped, providing a valuable resource for validating perception tasks from vision-based autonomous navigation to scientific imaging for future lunar missions targeting high-latitude regions or those intended for robots operating across perceptually degraded environments. The dataset can be downloaded from https://zenodo.org/records/13970078?preview=1, and a visual overview is available at https://youtu.be/d7sPeO50_2I. All supplementary material can be found at https://github.com/spaceuma/spice-hl3.


Statistical Study of Sensor Data and Investigation of ML-based Calibration Algorithms for Inexpensive Sensor Modules: Experiments from Cape Point

arXiv.org Artificial Intelligence

In this paper we present a statistical analysis of data from inexpensive sensors. We also present the performance of machine learning algorithms when used for automatic calibration of such sensors. In this work we used a low-cost Non-Dispersive Infrared CO$_2$ sensor placed at a co-located site at Cape Point, South Africa (maintained by Weather South Africa). The collected low-cost sensor data and site truth data are investigated and compared. We compare the performance of Random Forest Regression, Support Vector Regression, 1D Convolutional Neural Network and 1D-CNN Long Short-Term Memory Network models as methods for automatic calibration, and investigate the statistical properties of these models' predictions. In addition, we investigate the drift in performance of these algorithms over time.


A Novel Score-CAM based Denoiser for Spectrographic Signature Extraction without Ground Truth

arXiv.org Artificial Intelligence

Sonar-based audio classification techniques are a growing area of research in the field of underwater acoustics. Usually, underwater noise picked up by passive sonar transducers contains all types of signals that travel through the ocean and is transformed into spectrographic images. As a result, the corresponding spectrograms intended to display the temporal-frequency data of a certain object often include the tonal regions of abundant extraneous noise that can effectively interfere with a 'contact'. So, a majority of spectrographic samples extracted from underwater audio signals are rendered unusable due to their clutter and lack the required distinguishability between different objects. With limited clean true data for supervised training, creating classification models for these audio signals is severely bottlenecked. This paper derives several new techniques to combat this problem by developing a novel Score-CAM based denoiser to extract an object's signature from noisy spectrographic data without being given any ground truth data. In particular, this paper proposes a novel generative adversarial network architecture for learning and producing spectrographic training data in similar distributions to low-feature spectrogram inputs. In addition, this paper also presents a generalizable class activation mapping based denoiser for different distributions of acoustic data, even real-world data distributions. Utilizing these novel architectures and proposed denoising techniques, these experiments demonstrate state-of-the-art noise reduction accuracy and higher classification accuracy than current audio classification standards. As such, this approach has applications not only to audio data but also to many other data distributions used in machine learning.


ILLUME: Rationalizing Vision-Language Models through Human Interactions

arXiv.org Artificial Intelligence

Bootstrapping from pre-trained language models has been proven to be an efficient approach for building vision-language models (VLM) for tasks such as image captioning or visual question answering. However, outputs of these models rarely align with users' rationales for specific answers. In order to improve this alignment and reinforce commonsense reasons, we propose a tuning paradigm based on human interactions with machine-generated data. Our ILLUME executes the following loop: given an image-question-answer prompt, the VLM samples multiple candidate rationales, and a human critic provides feedback via preference selection, which is used for fine-tuning. This loop increases the training data and gradually carves out the VLM's rationalization capabilities that are aligned with human intent. Our exhaustive experiments demonstrate that ILLUME is competitive with standard supervised fine-tuning while using significantly less training data and requiring only minimal feedback.


Sim2Real Docs: Domain Randomization for Documents in Natural Scenes using Ray-traced Rendering

arXiv.org Artificial Intelligence

In the past, computer vision systems for digitized documents could rely on systematically captured, high-quality scans. Today, transactions involving digital documents are more likely to start as mobile phone photo uploads taken by non-professionals. As such, computer vision for document automation must now account for documents captured in natural scene contexts. An additional challenge is that task objectives for document processing can be highly use-case specific, which makes publicly-available datasets limited in their utility, while manual data labeling is also costly and poorly translates between use cases. To address these issues we created Sim2Real Docs - a framework for synthesizing datasets and performing domain randomization of documents in natural scenes. Sim2Real Docs enables programmatic 3D rendering of documents using Blender, an open source tool for 3D modeling and ray-traced rendering. By using rendering that simulates physical interactions of light, geometry, camera, and background, we synthesize datasets of documents in a natural scene context. Each render is paired with use-case specific ground truth data specifying latent characteristics of interest, producing unlimited fit-for-task training data. The role of machine learning models is then to solve the inverse problem posed by the rendering pipeline. Such models can be further iterated upon with real-world data by either fine tuning or making adjustments to domain randomization parameters.


Computer vision in AI: The data needed to succeed

#artificialintelligence

Developing the capacity to annotate massive volumes of data while maintaining quality is a function of the model development lifecycle that enterprises often underestimate. It's resource intensive and requires specialized expertise. At the heart of any successful machine learning/artificial intelligence (ML/AI) initiative is a commitment to high-quality training data and a pathway to quality data that is proven and well-defined. Without this quality data pipeline, the initiative is doomed to fail. Computer vision or data science teams often turn to external partners to develop their data training pipeline, and these partnerships drive model performance.


Automatic trajectory recognition in Active Target Time Projection Chambers data by means of hierarchical clustering

arXiv.org Machine Learning

The automatic reconstruction of three-dimensional particle tracks from Active Target Time Projection Chambers data can be a challenging task, especially in the presence of noise. In this article, we propose a nonparametric algorithm that is based on the idea of clustering point triplets instead of the original points. We define an appropriate distance measure on point triplets and then apply a single-link hierarchical clustering on the triplets. Compared to parametric approaches like RANSAC or the Hough transform, the new algorithm has the advantage of potentially finding trajectories even of shapes that are not known beforehand. This feature is particularly important in low-energy nuclear physics experiments with AT operating inside a magnetic field. The algorithm has been validated using data from experiments performed with the Active Target Time Projection Chamber (AT-TPC) at the National Superconducting Cyclotron Laboratory (NSCL). The results demonstrate the capability of the algorithm to identify and isolate particle tracks that describe non-analytical trajectories. For curved tracks, the vertex detection recall was 86% and the precision 94%. For straight tracks, the vertex detection recall was 96% and the precision 98%. In the case of a test set containing only straight linear tracks, the algorithm performed better than an iterative Hough transform. Keywords: Time Projection Chambers, Active Target, Pattern Recognition, Clustering 1. Introduction One of the present aims of modern low-energy nuclear physics is to provide a more complete understanding about the behavior of subatomic matter under large isospin (i.e.
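
The triplet-clustering idea from the abstract above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the paper's algorithm: the triplet construction rule (`max_gap`, `max_bend`), the distance measure (midpoint distance plus a direction penalty weighted by `angle_weight`), and every function name are hypothetical. Single-link clustering is cut at a fixed `cutoff`, which is equivalent to taking connected components of the "distance <= cutoff" graph. Points are assumed to be distinct.

```python
import math
from itertools import combinations

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(v):
    return math.sqrt(dot(v, v))

def make_triplets(points, max_gap, max_bend):
    """Build (midpoint, unit direction) triplets from roughly collinear neighbours."""
    triplets = []
    for i, b in enumerate(points):
        near = [j for j, p in enumerate(points)
                if j != i and norm(sub(p, b)) <= max_gap]
        for j, k in combinations(near, 2):
            u, v = sub(b, points[j]), sub(points[k], b)
            # Keep the triplet only if the path j -> i -> k bends by at most max_bend.
            if dot(u, v) / (norm(u) * norm(v)) >= math.cos(max_bend):
                d = sub(points[k], points[j])
                n = norm(d)
                triplets.append((b, tuple(x / n for x in d)))
    return triplets

def triplet_distance(t1, t2, angle_weight=5.0):
    """Hypothetical measure: midpoint distance plus a penalty for differing directions."""
    (m1, d1), (m2, d2) = t1, t2
    return norm(sub(m1, m2)) + angle_weight * (1.0 - abs(dot(d1, d2)))

def single_link_labels(items, dist, cutoff):
    """Single-link clustering cut at `cutoff`, implemented with union-find."""
    parent = list(range(len(items)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(items)), 2):
        if dist(items[i], items[j]) <= cutoff:
            parent[find(i)] = find(j)
    return [find(i) for i in range(len(items))]

# Two well-separated straight tracks, one along x and one along y.
points = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0),
          (10, 0, 0), (10, 1, 0), (10, 2, 0), (10, 3, 0)]
triplets = make_triplets(points, max_gap=1.5, max_bend=0.2)
labels = single_link_labels(triplets, triplet_distance, cutoff=2.0)
```

Because clustering operates on triplets rather than raw points, each cluster carries local direction information, which is what lets this kind of approach follow tracks whose overall shape is not known in advance.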


Automating automation: Machine learning behind the curtain

#artificialintelligence

Robotic process automation (RPA) can be the true antidote to manual, rote work, or it can be our worst nightmare if you listen to all the drama or the hype. RPA centers on the use of artificial intelligence (AI) to apply human-like thinking to streamline a typically manually intensive process or activity; and whether we like it or not, it's here to stay. Take, for instance, the process of data extraction from documents such as invoices. Application of advanced optical character recognition (OCR) and intelligent document recognition can automate a significant amount of the job of data entry typically performed by clerks or specialized data entry staff. Interestingly, human effort is still involved with attaining the ability to hand off a process or task to a machine.